Amendments to the Claims
This listing of claims will replace all prior versions of claims in the application: 
Listing of Claims: 

1. (Currently Amended) A system that facilitates extracting data in connection with
spam processing, comprising: 

a computer readable storage medium comprising: 

a component that receives a message and extracts a set of features 
associated with some part, content or content type of a message; and 

an analysis component that at least examines consecutiveness of
characters within a subject line of the message and a content
type of the message for spam in connection with building a filter, wherein the
content type is case-sensitive, comprises primary content-type and a secondary-content
type, or combinations thereof.

2. (Original) The system of claim 1, the analysis component determines frequency of 
consecutive repeating characters within the subject line of the message. 

3. (Original) The system of claim 2, the characters comprise letters, numbers, or 
punctuation. 

4. (Original) The system of claim 1, the analysis component determines frequency of 
white space characters within the subject line of the message. 

5. (Original) The system of claim 1, the analysis component determines distance 
between at least one alpha-numeric character and a blob. 

6. (Original) The system of claim 1, the analysis component determines a maximum 
number of consecutive, repeating characters and stores this information. 

7. (Original) The system of claim 1, the analysis component establishes ranges of
consecutive, repeating characters, the ranges corresponding to varying degrees of spaminess, 
whereby messages can be sorted by their respective individual count of consecutive repeating 
characters. 
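
As an illustration only, the run-length features recited in claims 2, 6 and 7 might be computed along the following lines. This is a minimal sketch, not the applicant's implementation: the function names `max_repeat_run` and `run_bucket` and the range boundaries `(2, 4, 8)` are hypothetical, and runs of white space are skipped on the assumption that only letters, numbers and punctuation are of interest (claim 3).

```python
from itertools import groupby


def max_repeat_run(subject: str) -> int:
    """Length of the longest run of one non-space character repeated consecutively."""
    return max(
        (len(list(group)) for ch, group in groupby(subject) if not ch.isspace()),
        default=0,
    )


def run_bucket(run_length: int, bounds=(2, 4, 8)) -> int:
    """Map a run length to a range index; a higher index means a longer run.

    The boundaries are hypothetical and are not taken from the claims.
    """
    return sum(run_length >= b for b in bounds)


subjects = ["Meeting notes", "FREE!!!!! offer", "Hello"]
# Sort messages by their individual count of consecutive repeating characters.
ranked = sorted(subjects, key=max_repeat_run, reverse=True)
```

Here `ranked` places the subject with the longest run first, and `run_bucket` assigns each stored maximum to one of the assumed ranges.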

8. (Cancelled) 

9. (Previously Presented) The system of claim 1, the analysis component compares
the content type of a current message to stored content types of a plurality of other messages to 
facilitate determining whether the message is spam. 
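
By way of example only, a case-sensitive comparison of a content type against stored content types could look like the sketch below. The names `split_content_type`, `seen_in_spam` and `times_seen`, and the sample stored values, are hypothetical; the claims do not specify how the stored content types are kept.

```python
from collections import Counter


def split_content_type(header_value: str) -> tuple[str, str]:
    """Split 'text/HTML; charset=...' into (primary, secondary), preserving case."""
    media_type = header_value.split(";", 1)[0].strip()
    primary, _, secondary = media_type.partition("/")
    return primary, secondary


# Hypothetical store of content types observed in other messages.
seen_in_spam = Counter(
    split_content_type(v) for v in ["text/HTML", "text/HTML", "TEXT/plain"]
)


def times_seen(header_value: str) -> int:
    """How often this exact (case-sensitive) content type appears in the store."""
    return seen_in_spam[split_content_type(header_value)]
```

Because case is preserved, `text/HTML` and `text/html` are counted as different content types in this sketch.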

10. (Cancelled) 

11. (Cancelled) 

12. (Original) The system of claim 1, the analysis component further determines time 
stamps associated with the message. 

13. (Original) The system of claim 12, the analysis component determining a delta 
between time stamps. 

14. (Original) The system of claim 13, the delta is between a first and a last time
stamp.
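
For illustration, the delta of claims 13 and 14 might be computed as sketched below, assuming the time stamps are RFC 2822 style date strings and that both are converted to coordinated universal time before subtraction. The function name `timestamp_delta` and the sample values are hypothetical.

```python
from datetime import timedelta, timezone
from email.utils import parsedate_to_datetime


def timestamp_delta(stamps: list[str]) -> timedelta:
    """Delta between the first and last time stamp, both converted to UTC."""
    first = parsedate_to_datetime(stamps[0]).astimezone(timezone.utc)
    last = parsedate_to_datetime(stamps[-1]).astimezone(timezone.utc)
    return last - first


# Hypothetical time stamps carrying different zone offsets.
stamps = [
    "Mon, 02 Jun 2003 10:00:00 -0700",
    "Mon, 02 Jun 2003 17:05:30 +0000",
]
delta = timestamp_delta(stamps)
```

In this example the two stamps differ by five and a half minutes once the zone offsets are taken into account.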

15. (Original) The system of claim 1, the analysis component determines at least one 
of: a percentage of white space to non-white space in the subject line of the message and a
percentage of non-white space and non-numeric characters that are not letters in the subject line
of the message. 
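
The two percentages of claim 15 could, as one possible reading, be computed as in the following sketch. It assumes each percentage is taken relative to the total number of characters in the subject line, which the claim does not state; the function name `subject_ratios` is hypothetical.

```python
def subject_ratios(subject: str) -> tuple[float, float]:
    """Return (white-space percentage, symbol percentage) of all characters.

    A "symbol" here is any character that is not white space, not a digit
    and not a letter. Measuring against total length is an assumption.
    """
    if not subject:
        return 0.0, 0.0
    total = len(subject)
    spaces = sum(ch.isspace() for ch in subject)
    symbols = sum(
        not (ch.isspace() or ch.isdigit() or ch.isalpha()) for ch in subject
    )
    return 100.0 * spaces / total, 100.0 * symbols / total


space_pct, symbol_pct = subject_ratios("BUY NOW!!!")
```

For the ten-character subject above, one character is white space and three are symbols.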

16. (Original) The system of claim 1, the filter being a spam filter. 

17. (Original) The system of claim 1, the filter being a parental control filter.

18. (Original) The system of claim 1, further comprising a machine learning system 
component that employs at least a subset of extracted features to learn at least one of spam and 
non-spam. 

19. (Withdrawn) A system embodied on a computer readable storage medium that 
facilitates extracting data in connection with spam processing, comprising: 

a component that receives an item and extracts a set of features indicative of spam 
associated with a message, at least one of the features is a normalized time delta; and 

an analysis component that determines whether an embedded message or 
attachment is associated with the message. 

20. (Withdrawn) The system of claim 19, the analysis component identifies a type of 
embedded message or attachment to facilitate predicting whether the message is spam. 

21. (Withdrawn) The system of claim 19, further comprising a component that
employs at least a subset of the extracted features to populate at least one feature list. 

22. (Withdrawn) The system of claim 21, the at least one feature list is any one of a 
list of good users, a list of spammers, a list of positive features indicating legitimate sender, and a 
list of features indicating spam. 

23. (Withdrawn) The system of claim 19, further comprising a component that 
examines at least a portion of a message body. 

24. (Withdrawn) The system of claim 23, the component examines at least a 
beginning portion of the message body. 

25. (Withdrawn) The system of claim 23, the component determines at least one of: a
percentage of white space to non-white space in the message body and a percentage of non-white 
space and non-numeric characters that are not letters in the message body. 

26. (Withdrawn) The system of claim 23, the component determines a percentage or a 
number of consecutive lines of a message body to examine. 

27. (Withdrawn) The system of claim 23, the component examines the message body 
for the presence of at least one blob or consecutive, repeating characters. 

28. (Withdrawn) The system of claim 27, the characters comprising letters, 
punctuation, and numbers. 

29. (Withdrawn) A computer-readable storage medium that performs a method that 
facilitates spam detection and prevention, the method comprising: 

receiving a plurality of messages, the plurality comprising at least a first 
and a second message; 

extracting at least a subset of information from the plurality of messages, the 
information being from at least one of a subject line, a content-type header, a received header, 
and a message body; 

analyzing the subset of information to generate one or more features to facilitate 
training a filter; 

determining time stamps associated with the message; and 

determining a delta between a first time stamp and a last time stamp, the first time 
stamp being located in a Received header and the last time stamp being located in a Date 
header at the message's destination. 

30. (Withdrawn) The method of claim 29, analyzing the subset of information 
comprises determining a number of consecutive repeating characters within the subject line or 
the message body of the message. 

31. (Withdrawn) The method of claim 30, the characters comprise letters, numbers, or
punctuation. 

32. (Withdrawn) The method of claim 29, analyzing the subset of information 
comprises determining a frequency of white space characters within the subject line of the 
message. 

33. (Withdrawn) The method of claim 29, analyzing the subset of information 
comprises determining a distance between at least one alpha-numeric character and a blob. 

34. (Withdrawn) The method of claim 29, analyzing the subset of information 
comprises: 

determining a maximum number of consecutive, repeating 
characters and storing this information; and 

establishing ranges of consecutive, repeating characters, the ranges 
corresponding to varying degrees of spaminess, whereby messages can be sorted by their 
respective individual count of consecutive repeating characters. 

35. (Withdrawn) The method of claim 29, analyzing the subset of information 
comprises: 

determining content type associated with the message; and 
comparing the content type of a current message to stored content types of 
a plurality of other messages to facilitate determining whether the message is spam. 

36. (Cancelled) 

37. (Withdrawn) The method of claim 29, analyzing the subset of information 
comprises determining a percentage or a number of consecutive lines of a message body to 
examine at least one of: a percentage of white space to non-white space in the subject line of the 
message and a percentage of non-white space and non-numeric characters that are not letters in 
the subject line of the message. 

38. (Withdrawn) The method of claim 29, analyzing the subset of information
comprises determining whether an embedded message or an attachment exists in the message 
and identifying a type of embedded message or attachment to facilitate predicting whether the 
message is spam. 

39. (Withdrawn) The method of claim 29, analyzing the subset of information 
comprises examining at least a beginning portion of the message body. 

40. (Withdrawn) A computer-readable medium having stored thereon the following 
computer executable components: 

a component that receives a message and extracts a set of features associated with 
some part, content or content type of a message, wherein at least one content type is case- 
sensitive, comprises primary content-type and a secondary-content type, or combinations thereof; 

an analysis component that examines at least consecutiveness of characters within 
a subject line of the message in connection with building a filter; 

a component that determines a delta between a first time stamp and a last time 
stamp associated with the message, the first time stamp and the last time stamp are normalized to 
a coordinated universal time; 

a component that determines whether an embedded message or attachment is 
associated with the message; and 

a component that determines a percentage or a number of consecutive lines of a 
message body to examine and that examines the message body for the presence of at least one 
blob or consecutive, repeating characters. 

41. (Withdrawn) A system embodied on one or more computers that facilitates
extracting data in connection with spam processing, comprising: 

means for receiving a plurality of messages, the plurality comprising at least a 
first and a second message; 

means for extracting at least a subset of information from the plurality of
messages, the information being from at least one of a subject line, a content-type header, a 
received header, and a message body; and 

means for analyzing the subset of information to generate one or more features to 
facilitate training a filter, the means for analyzing the subset of information comprising: 

means for determining a number of consecutive repeating characters within the 
subject line or the message body of the message; 

means for determining a delta between a first time stamp and a last time 
stamp associated with the message, the first time stamp and the last time stamp are 
normalized to a coordinated universal time; 

means for determining whether an embedded message or an attachment 
exists in the message and identifying a type of embedded message or attachment to 
facilitate predicting whether the message is spam; and 

means for determining a percentage or a number of consecutive lines of a 
message body to examine at least one of: a percentage of white space to non-white space 
in the subject line of the message and a percentage of non-white space and non-numeric 
characters that are not letters in the subject line of the message. 